Portability of syntactic structure for language modeling
Abstract
This paper studies the portability of statistical syntactic knowledge within the structured language model (SLM) framework. We investigate the impact of porting SLM statistics from the Wall Street Journal (WSJ) domain to the Air Travel Information System (ATIS) domain. We compare this approach with two alternatives for gathering the initial SLM statistics: applying the Microsoft rule-based parser (NLPwin) to the ATIS data, and using a small amount of data manually parsed at UPenn. Surprisingly, although its perplexity (PPL) is only modest, the model initialized on WSJ parses outperforms the other initialization methods, which rely on in-domain annotated data. It achieves a significant word error rate (WER) reduction of 0.4% absolute and 7% relative over a baseline system whose WER is 5.8%; measured relative to the minimum WER achievable on the N-best lists we worked with, the improvement is 12%.
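The WER figures in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and reading the 12% figure as "realized gain divided by the gap between baseline WER and the best WER achievable on the N-best lists" is an assumption about how that number was computed, not something the abstract spells out. The abstract does not give the minimum achievable WER, so the second function is defined but not applied to the paper's numbers.

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER removed by the new system."""
    return (baseline_wer - new_wer) / baseline_wer


def headroom_reduction(baseline_wer: float, new_wer: float, oracle_wer: float) -> float:
    """Fraction of the achievable gain (baseline minus oracle) that was realized.

    oracle_wer is the minimum WER achievable on the N-best lists; its value
    is not stated in the abstract, so it must be supplied by the caller.
    """
    return (baseline_wer - new_wer) / (baseline_wer - oracle_wer)


baseline = 5.8             # baseline WER in percent, from the abstract
improved = baseline - 0.4  # 0.4% absolute reduction, from the abstract

rel = relative_reduction(baseline, improved)
print(f"relative WER reduction: {rel:.1%}")
```

The relative reduction comes out at about 6.9%, which the abstract rounds to 7%.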